ABSTRACT

Game  theory  provides  a  framework  for  modeling  a  wide  range  of  security  and 
defense  problems.  This  project  focuses  on  Stackelberg  strategies,  which 
are  optimal  when  one  player  can  commit  to  a  (possibly  randomized)  strategy 
before  the  other  player  moves.  For  example,  a  defensive  unit  can  commit  to 
a  randomized  patrolling  pattern  to  deter  attacks. 

This  project  explores  new  approaches  for  efficiently  computing  Stackelberg 
strategies  in  realistic  security  domains  with  exponentially  large  strategy 
spaces.  Potential  impacts  of  this  research  include  increased  ability  to 
compute  optimal  strategies  for  security  and  defense  scenarios. 

Notable  contributions  of  the  project  include:  (1)  New  algorithms  and 
complexity  results  for  security  games  as  well  as  unrestricted  games.  The 
algorithms  allow  us  to  solve  new  classes  of  games  efficiently;  the 
complexity  results  indicate  that  other  methods  are  needed  for  richer 
classes  of  games.  (2)  A  deeper  understanding  of  the  role  of  commitment  and 
the assumption that the attacker can observe the defender's strategy.
These results indicate that, in a sense, Stackelberg strategies are
"safe" to play even when this assumption does not hold, in some security
domains (but not all; to address this shortcoming, we also provide a
methodology for other security games).
Scientific Progress


Our  JAIR'11  paper  (Korzhyk,  Yin,  Kiekintveld,  Conitzer,  and  Tambe) 
significantly  extends  our  AAMAS'10  paper.  In  this  paper,  we  investigate 
the  role  of  commitment  and  the  assumption  that  the  attacker  can  observe  the 
defender's  strategy;  without  this  assumption,  we  have  a  simultaneous-move 
game  and  Nash  equilibrium  would  be  a  more  natural  solution  concept  to  use. 

We  prove  that  under  a  natural  restriction  on  the  family  of  games,  defender 
Stackelberg  strategies  must  also  be  Nash  strategies,  and  moreover  that  the 
Nash  equilibria  are  interchangeable.  This  interchangeability  property 
means  that  if  one  player  plays  according  to  one  equilibrium  and  the  other 
player  according  to  another  equilibrium,  the  result  is  guaranteed  to  still 
be an equilibrium. In general games, this is not always true, which leads to
the dreaded "equilibrium selection problem": a player does not know which
equilibrium to play. Thanks to the interchangeability property, in these
security games we need not worry about choosing the wrong equilibrium, and
in particular, by the first result, we can simply choose the Stackelberg
strategy. Hence, Stackelberg strategies
are  robust  to  changes  in  the  game  model  that  concern  commitment  and 
observability.  We  also  ran  simulations  on  games  that  do  not  satisfy  the 
properties  needed  for  Stackelberg  strategies  to  also  be  Nash  strategies;  the 
simulations  suggest  that  Stackelberg  strategies  are  still  often  Nash 
strategies  in  these  games,  except  when  the  attacker  can  perform  complex 
coordinated  attacks  in  multiple  locations. 

In an IJCAI'11 paper (Korzhyk, Conitzer, Parr), we further study this
problem  of  an  attacker  that  performs  multiple  simultaneous  attacks.  While 
(as  was  shown  in  the  JAIR  paper)  Stackelberg  strategies  are  not  usually 
also  Nash  strategies  in  this  context,  we  show  that  at  least  the 
interchangeability  property  of  Nash  equilibria  is  still  satisfied,  so  one 
still  does  not  need  to  worry  about  which  equilibrium  strategy  is  the 
"right"  one.  We  also  give  a  polynomial-time  algorithm  for  computing  a 
Nash  equilibrium  in  this  context,  which  initializes  the  number  of  defender 
resources  at  zero  and  gradually  increases  them  to  the  desired  number,  all 
the  while  maintaining  an  equilibrium  of  the  game.  On  the  other  hand,  we  show 
that  computing  a  Stackelberg  strategy  is  actually  NP-hard.  (These  results 
were  surprising  to  us,  because,  in  contrast,  in  two-player  normal-form 
games,  computing  a  Stackelberg  strategy  can  be  done  in  polynomial  time, 
whereas  computing  a  Nash  equilibrium  is  PPAD-complete  and  computing  an 
optimal  Nash  equilibrium  is  NP-hard.) 

Of  course,  this  still  does  not  resolve  what  to  do  in  such  games  when  one  is 
not  sure  whether  the  attacker  can  observe  the  mixed  strategy  (and,  hence, 
whether  Stackelberg  or  Nash  is  the  right  model).  Our  JAIR  paper  above  does 
propose  a  game  model  in  which  this  uncertainty  is  modeled  explicitly,  but 
it does not provide any algorithm for solving these games. In an AAMAS'11
paper  (Korzhyk,  Conitzer,  Parr),  we  propose  an  algorithm  for  solving  these 
games  that  uses  Nash  and  Stackelberg  solvers  as  subroutines.  (The 
algorithm  will  work  on  any  game  for  which  such  solvers  are  available.)  We 
show  that  in  simulations  a  small  number  of  calls  to  these  solvers  is 
sufficient  to  solve  the  games. 

In  another  (still  unpublished)  draft  (Letchford,  Korzhyk,  Conitzer),  we 
study,  for  various  classes  of  games  including  security  games,  how  much  can 
be  gained  by  having  the  ability  to  commit  to  a  strategy  before  the  other 
player  moves.  We  find  that  usually  games  can  be  constructed  where  the 
gains  from  commitment  are  extreme,  though  when  taking  an  average  over  many 
randomly  drawn  games,  the  benefits  from  commitment  tend  to  be  much  less 
extreme. 

In another AAMAS'11 paper (Jain, Korzhyk, Vanek, Conitzer, Pechoucek,
Tambe), we study the "Mumbai problem": in response to the 2008 terrorist
attacks on Mumbai, the Mumbai police have started to set up checkpoints in
the city; how can we allocate these in a game-theoretically optimal way?
We model this (for now) as a zero-sum game between a defender and an
attacker on a graph, where the defender chooses edges in the graph to
defend and the attacker chooses a target and a path to that target.
Crucially, we do allow the targets to have varying values, which makes an
earlier exact approach inapplicable; we also show that an existing
approximate approach can be arbitrarily suboptimal. We present the RUGGED
(Randomization in Urban Graphs by Generating strategies for Enemy and
Defender) algorithm, which uses column and constraint generation techniques
to incrementally add strategies to the game until convergence to an optimal
solution, and show that it scales to the southern part of Mumbai.

In an AAAI'11 paper (Conitzer and Korzhyk), we study the computation of
Stackelberg  strategies  in  general  normal-form  games.  We  show  that  there  is 
a  close  relationship  between  the  standard  linear  program  for  computing  a 
correlated  equilibrium  of  a  game  (a  fairly  well-known  relaxation  of  the 
concept  of  Nash  equilibrium),  and  the  linear-programming  approach  for 
computing  Stackelberg  strategies.  This  suggests  a  new  linear-programming 
approach  for  computing  Stackelberg  strategies,  and  in  our  simulations  on 
50x50  games  this  new  formulation  is  faster  than  the  standard  approach  that 
involves  solving  multiple  LPs.  Perhaps  more  importantly,  it  gives  a  way  to 
extend  this  approach  to  more  than  two  players  —  specifically,  to  settings 
with  a  single  leader  and  an  arbitrary  number  of  followers.  This 
generalization  to  more  than  two  players  does  require  that  the  leader  can 
send  signals  to  the  followers.  (Similarly,  in  a  correlated  equilibrium,  a 
mediator  sends  signals  to  all  the  players.) 
Efficient Algorithms for Computing Stackelberg
Strategies  in  Security  Games 
Final  Report 

Vincent  Conitzer  and  Ronald  Parr 
Duke  University 

1  Statement  of  the  problem  studied 

Game  theory  provides  a  framework  for  modeling  a  wide  range  of  security  and 
defense problems. This project focuses on Stackelberg strategies, which are
optimal when one player can commit to a (possibly randomized) strategy before the
other  player  moves.  For  example,  a  defensive  unit  can  commit  to  a  randomized 
patrolling  pattern  to  deter  attacks. 

This  project  explores  new  approaches  for  efficiently  computing  Stackelberg 
strategies  in  realistic  security  domains  with  exponentially  large  strategy  spaces. 
Potential  impacts  of  this  research  include  increased  ability  to  compute  optimal 
strategies  for  security  and  defense  scenarios. 

Notable  contributions  of  the  project  include: 

1. New algorithms and complexity results for security games as well as
unrestricted games. The algorithms allow us to solve new classes of games
efficiently;  the  complexity  results  indicate  that  other  methods  are  needed 
for  richer  classes  of  games. 

2.  A  deeper  understanding  of  the  role  of  commitment  and  the  assumption 
that the attacker can observe the defender’s strategy. These results
indicate that, in a sense, Stackelberg strategies are “safe” to play even when
this assumption does not hold, in some security domains (but not all;
to address this shortcoming, we also provide a methodology for other
security games).

2  Summary  of  the  most  important  results 

Our JAIR’11 paper [5] significantly extends our AAMAS’10 paper. In this
paper, we investigate the role of commitment and the assumption that the
attacker can observe the defender’s strategy; without this assumption, we have a
simultaneous-move game and Nash equilibrium would be a more natural
solution concept to use. We prove that under a natural restriction on the family
of games, defender Stackelberg strategies must also be Nash strategies, and
moreover  that  the  Nash  equilibria  are  interchangeable.  This  interchangeability 
property  means  that  if  one  player  plays  according  to  one  equilibrium  and  the 
other  player  according  to  another  equilibrium,  the  result  is  guaranteed  to  still 
be an equilibrium. In general games, this is not always true, which leads to the
dreaded “equilibrium selection problem”: a player does not know which
equilibrium to play. Thanks to the interchangeability property, in
these security games we need not worry about choosing the wrong equilibrium,
and in particular, by the first result, we can simply choose the Stackelberg strategy.
Hence,  Stackelberg  strategies  are  robust  to  changes  in  the  game  model  that 
concern  commitment  and  observability.  We  also  ran  simulations  on  games  that 
do  not  satisfy  the  properties  needed  for  Stackelberg  strategies  to  also  be  Nash 
strategies;  the  simulations  suggest  that  Stackelberg  strategies  are  still  often 
Nash  strategies  in  these  games,  except  when  the  attacker  can  perform  complex 
coordinated  attacks  in  multiple  locations. 
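
The interchangeability property described above can be checked mechanically on small games. The following is a minimal sketch, not code from the paper: the function names and both example games are our own illustrative assumptions. It uses a coordination game, where interchangeability fails, and a small zero-sum game, a class in which equilibria are known to be interchangeable, as a stand-in for the security games discussed here.

```python
# Minimal sketch (hypothetical helper names): test whether two Nash
# equilibria of a bimatrix game (A, B) are interchangeable.
# Payoffs and probabilities should be ints or Fractions so that the
# equality comparisons below are exact.

def row_payoffs(M, y):
    """Expected payoff of each pure row against the column mix y."""
    return [sum(M[i][j] * y[j] for j in range(len(y))) for i in range(len(M))]

def col_payoffs(M, x):
    """Expected payoff of each pure column against the row mix x."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]

def is_nash(A, B, x, y):
    """True if neither player can gain by deviating from (x, y)."""
    rows = row_payoffs(A, y)
    cols = col_payoffs(B, x)
    row_value = sum(x[i] * rows[i] for i in range(len(x)))
    col_value = sum(y[j] * cols[j] for j in range(len(y)))
    return row_value == max(rows) and col_value == max(cols)

def interchangeable(A, B, eq1, eq2):
    """True if swapping strategies between two equilibria yields equilibria."""
    (x1, y1), (x2, y2) = eq1, eq2
    return is_nash(A, B, x1, y2) and is_nash(A, B, x2, y1)

# Coordination game: both players want to match. Two pure equilibria exist,
# but mixing them up produces a mismatch (the equilibrium selection problem).
COORD = [[1, 0], [0, 1]]
COORD_EQ1 = ([1, 0], [1, 0])
COORD_EQ2 = ([0, 1], [0, 1])

# Zero-sum game with several equilibria (column player's payoff is -A).
ZS_A = [[1, 1, 2], [1, 1, 2], [0, 0, 1]]
ZS_B = [[-v for v in row] for row in ZS_A]
ZS_EQ1 = ([1, 0, 0], [1, 0, 0])
ZS_EQ2 = ([0, 1, 0], [0, 1, 0])

coord_ok = interchangeable(COORD, COORD, COORD_EQ1, COORD_EQ2)  # False
zero_sum_ok = interchangeable(ZS_A, ZS_B, ZS_EQ1, ZS_EQ2)       # True
```

In the coordination game, a row player following the first equilibrium and a column player following the second end up mismatched, which is exactly the failure that interchangeability rules out.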

In an IJCAI’11 paper [3], we further study this problem of an attacker that
performs multiple simultaneous attacks. While (as was shown in the JAIR
paper) Stackelberg strategies are not usually also Nash strategies in this context,
we  show  that  at  least  the  interchangeability  property  of  Nash  equilibria  is  still 
satisfied,  so  one  still  does  not  need  to  worry  about  which  equilibrium  strategy  is 
the  “right”  one.  We  also  give  a  polynomial-time  algorithm  for  computing  a  Nash 
equilibrium  in  this  context,  which  initializes  the  number  of  defender  resources  at 
zero and gradually increases them to the desired number, all the while
maintaining an equilibrium of the game. On the other hand, we show that computing a
Stackelberg  strategy  is  actually  NP-hard.  (These  results  were  surprising  to  us, 
because,  in  contrast,  in  two-player  normal-form  games,  computing  a  Stackelberg 
strategy  can  be  done  in  polynomial  time,  whereas  computing  a  Nash  equilibrium 
is  PPAD-complete  and  computing  an  optimal  Nash  equilibrium  is  NP-hard.) 

Of  course,  this  still  does  not  resolve  what  to  do  in  such  games  when  one  is  not 
sure  whether  the  attacker  can  observe  the  mixed  strategy  (and,  hence,  whether 
Stackelberg  or  Nash  is  the  right  model).  Our  JAIR  paper  above  does  propose 
a  game  model  in  which  this  uncertainty  is  modeled  explicitly,  but  it  does  not 
provide any algorithm for solving these games. In an AAMAS’11 paper [4], we
propose  an  algorithm  for  solving  these  games  that  uses  Nash  and  Stackelberg 
solvers  as  subroutines.  (The  algorithm  will  work  on  any  game  for  which  such 
solvers  are  available.)  We  show  that  in  simulations  a  small  number  of  calls  to 
these  solvers  is  sufficient  to  solve  the  games. 

In another (still unpublished) draft [6], we study, for various classes of games
including security games, how much can be gained by having the ability to
commit to a strategy before the other player moves. We find that usually games
can be constructed where the gains from commitment are extreme, though when
taking an average over many randomly drawn games, the benefits from
commitment tend to be much less extreme.
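
To make the notion of a gain from commitment concrete, the sketch below computes an optimal strategy to commit to in a game where the leader has two pure strategies. The example game and the function name are our own illustrative assumptions and are not taken from the draft. In this game the leader's first row strictly dominates, so the only Nash equilibrium gives the leader 2, whereas committing to a uniform mix gives 7/2 (assuming, as is standard, that the follower breaks ties in the leader's favor).

```python
from fractions import Fraction

def stackelberg_2xn(leader, follower):
    """Optimal commitment for a leader with two rows (illustrative sketch).

    leader[i][j] and follower[i][j] are the payoffs when the leader plays
    row i and the follower plays column j. Returns (p, j, value): the
    probability p on row 0, the follower's response j, and the leader's
    expected payoff. Ties are broken in the leader's favor.
    """
    n = len(leader[0])
    # The leader's payoff is linear in p while a given column stays a best
    # response, so it suffices to check p = 0, p = 1, and every p at which
    # the follower is indifferent between two columns.
    candidates = {Fraction(0), Fraction(1)}
    for j in range(n):
        for k in range(j + 1, n):
            denom = ((follower[0][j] - follower[1][j])
                     - (follower[0][k] - follower[1][k]))
            if denom != 0:
                p = Fraction(follower[1][k] - follower[1][j], denom)
                if 0 <= p <= 1:
                    candidates.add(p)
    best = None
    for p in candidates:
        f_util = [p * follower[0][j] + (1 - p) * follower[1][j] for j in range(n)]
        top = max(f_util)
        for j in range(n):
            if f_util[j] == top:  # column j is a follower best response
                value = p * leader[0][j] + (1 - p) * leader[1][j]
                if best is None or value > best[2]:
                    best = (p, j, value)
    return best

# Hypothetical 2x2 example game (rows: leader, columns: follower).
LEADER = [[2, 4], [1, 3]]
FOLLOWER = [[1, 0], [0, 1]]

result = stackelberg_2xn(LEADER, FOLLOWER)
# p = 1/2, follower plays column 1, leader value 7/2 (versus 2 under Nash)
```

The same idea generalizes to more leader strategies by solving one linear program per follower response instead of enumerating breakpoints.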

In another AAMAS’11 paper [2], we study the “Mumbai problem”: in
response to the 2008 terrorist attacks on Mumbai, the Mumbai police have
started to set up checkpoints in the city; how can we allocate these in a
game-theoretically optimal way? We model this (for now) as a zero-sum game between
a defender and an attacker on a graph, where the defender chooses edges in the
graph to defend and the attacker chooses a target and a path to that target.
Crucially, we do allow the targets to have varying values, which makes an
earlier exact approach inapplicable; we also show that an existing approximate
approach can be arbitrarily suboptimal. We present the RUGGED
(Randomization in Urban Graphs by Generating strategies for Enemy and Defender)
algorithm, which uses column and constraint generation techniques to
incrementally add strategies to the game until convergence to an optimal solution,
and show that it scales to the southern part of Mumbai.

In an AAAI’11 paper [1], we study the computation of Stackelberg strategies
in general normal-form games. We show that there is a close relationship
between the standard linear program for computing a correlated equilibrium of a
game (a fairly well-known relaxation of the concept of Nash equilibrium), and
the linear-programming approach for computing Stackelberg strategies. This
suggests a new linear-programming approach for computing Stackelberg
strategies, and in our simulations on 50x50 games this new formulation is faster than
the standard approach that involves solving multiple LPs. Perhaps more
importantly, it gives a way to extend this approach to more than two players:
specifically, to settings with a single leader and an arbitrary number of followers.
This  generalization  to  more  than  two  players  does  require  that  the  leader  can 
send  signals  to  the  followers.  (Similarly,  in  a  correlated  equilibrium,  a  mediator 
sends  signals  to  all  the  players.) 
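
The relationship between the two formulations can be sketched as follows. The notation is our own assumption for illustration (leader pure strategies s, follower pure strategies t, utilities u_l and u_f) and may differ from the paper's. The first program is the multiple-LP approach, solved once per follower strategy t, keeping the best solution. The second is a single program over joint distributions that keeps only the follower's incentive constraints from the correlated-equilibrium program, since the leader commits and needs no incentive constraints of its own.

```latex
% Multiple-LP approach: one LP for each follower pure strategy t.
\begin{align*}
\max_{p}\quad & \sum_{s} p_s\, u_\ell(s,t) \\
\text{s.t.}\quad & \sum_{s} p_s\, u_f(s,t) \ge \sum_{s} p_s\, u_f(s,t') && \forall t' \\
& \sum_{s} p_s = 1, \qquad p_s \ge 0 && \forall s
\end{align*}

% Single LP over joint distributions p_{s,t} (two-player case).
\begin{align*}
\max_{p}\quad & \sum_{s,t} p_{s,t}\, u_\ell(s,t) \\
\text{s.t.}\quad & \sum_{s} p_{s,t}\, u_f(s,t) \ge \sum_{s} p_{s,t}\, u_f(s,t') && \forall t, t' \\
& \sum_{s,t} p_{s,t} = 1, \qquad p_{s,t} \ge 0 && \forall s, t
\end{align*}
```

Here p_{s,t} can be read as the probability that the leader plays s while signaling the follower to play t; the constraints say that the follower has no incentive to disobey the signal.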

3  Personnel  funded 

Besides the PIs (Conitzer and Parr), two Duke Computer Science Ph.D. students
have been funded from this award: Dmytro (Dima) Korzhyk and Joshua (Josh)
Letchford.  Both  are  currently  expected  to  complete  their  Ph.D.  dissertations 
on  topics  closely  related  to  this  grant  in  2013. 